Abstract

In this paper, a generalization of the traditional point-to-point communication setup, named "reliable communications with asymmetric codebooks", is proposed. Under the assumption of independent identically distributed (i.i.d.) encoder codewords, it is proven that the operational capacity of the system is equal to the information capacity of the system, which is given by $\max_{p(x)} I(U;Y)$, where $X$, $U$ and $Y$ denote the individual random elements of encoder codewords, decoder codewords and decoder inputs. The capacity result is derived in the "binary symmetric" case (which is an analogous formulation of the traditional "binary symmetric channel" for our case), as a function of the system parameters. A conceptually insightful inference is made by attributing the difference from the classical Shannon-type capacity of the binary symmetric channel to the gap due to the codebook asymmetry.

I. Introduction 

We consider a point-to-point communication problem, where the codebooks of the encoder and the decoder are not the same,
i.e., there is an asymmetry between them. In particular, we consider the scenario where the decoder's codebook is a perturbed
version of the encoder's codebook; the statistical characterization of this perturbation is fixed and known by both the encoder
and the decoder. Thus, the problem we consider can be viewed as a generalized version of the original reliable point-to-point
communications problem stated by Shannon [1].

In our proposed setup, we aim to model a scenario where the encoder (say, party A) and the decoder (say, party B)
belong to two collaborating, yet different, entities that communicate with each other: A would like to send a message
to B. However, for reasons of privacy (a clarifying practical example will be provided shortly), A does not want to
share its codebook with B, while still maintaining a reliable communications link. Hence, due to the constraint of "reliable
communications", B necessarily needs to possess a codebook which is "somehow related" to the codebook of A. In our case, we
assume that this "codebook relationship" is simply captured by a joint distribution of the codewords of both codebooks. In our
formulation, we assume that the conditional distribution of the codewords of the codebook of B, conditioned on the codewords
of the codebook of A, is fixed. Therefore, due to Bayes' rule, the only design parameter is the marginal distribution of the
codewords of the codebook of A. Our fundamental goal is to derive and characterize the maximum achievable rate of reliable
information transmission between A and B in the aforementioned setup.

A practical problem, which is closely related to the proposed setup, is robust signal hashing, which is an area of research
interest, particularly in signal processing and multimedia security. In the problem of robust signal hashing, the goal is to find 
a practical solution to "content tracking for anti-piracy search" with the aid of some side information at the receiver end. In 
that case, A is the owner of a valuable set of signals and would like to reliably find out whether any element of this set has
been used without a proper consent; thus, A would like to keep track of such prohibited uses. Furthermore, usually the content 
owner A does not have the necessary resources to perform a desired anti-piracy search, and hence needs to utilize the resources 
of another entity (which accounts for B in the aforementioned setup). As a result, A would like to form a collaboration with B 
to carry out anti-piracy search, but at the same time does not want to reveal the "private" content itself due to its value (which 
accounts for the privacy issue in the aforementioned setup). Therefore, B should perform the anti-piracy search only with the 
aid of the side information provided by A about the original content. Here, the side information provided to the decoder (also 
termed as the receiver throughout the paper) side is termed as the "hash values" of the original content and the method via 
which they are constructed is termed as "the robust signal hashing algorithm" in robust signal hashing literature. 

Adopting an information-theoretic approach to the robust signal hashing problem, we view the content owned by A as
its codebook. The message transmission phase of the information-theoretic setup corresponds to publishing or broadcasting
(without getting proper consent from A) the codewords of the encoder's codebook, possibly after introducing some disturbance.¹
The codebook made available to B (which is a "perturbed" version of the codebook of A) represents the "side information"
(termed as the "hash values" of the valuable content in the multimedia security literature), using which the anti-piracy search is 
to be carried out. Note that, the statistical characterization of the perturbation between the codebooks of A and B represents the 
"robust signal hashing algorithm". As a result, the proposed problem of reliable communications with asymmetric codebooks 
constitutes the fundamental upper bound on the performance (i.e., the maximum rate of error-free information transmission) of 
any given robust signal hashing algorithm. We refer the interested reader to [7], [8], [9] for some practical robust signal hash 
algorithms proposed in the literature and [10] (resp. [11]) for a detection (resp. decision) theoretic treatment of the problem. 

Next, we compare and contrast the reliable communications with asymmetric codebooks problem with the existing "related" 
formulations in the literature. First, observe that the problem at hand may be thought to belong to the class of traditional side 
information problems of Shannon theory (e.g. [3], [4], [5], [6]). However, this is indeed not the case due to the presence of 
asymmetric codebooks in our setup (which does not exist in the class of side information problems). For the case of "side 
information problems" dealing with channel coding, observe that the side information is about the system parameters, and/or 
the noise corrupting the message, but the codebooks employed in the system are always shared between the parties; in other
words, either the transmitter or the receiver is favored by the usage of the provided side information, which is not available
to the other party. However, for the proposed problem of reliable communications with asymmetric codebooks, there is no
shared codebook between the two communicating parties (as hinted by the name of the proposed problem), and the system
parameters (which amount to the statistical characterization of the system variables in our case, cf. Sec. II-B) are precisely
known by both the encoder and the decoder. As a result, we believe that the proposed setup does not trivially reduce to the
known classical problems of Shannon theory.

¹ The incorporation of a disturbance is modeled by the presence of a "noisy communication channel" in our problem, cf. Section II-B.

Main Results: For the proposed problem of reliable communications with asymmetric codebooks, we particularly focus on 
the scenario where: 

(i) the alphabet, from which the encoder's codewords are drawn, is discrete and finite, 

(ii) the communication channel between the encoder and the decoder is memoryless, 

(iii) the statistical characterization of the perturbation between the codebooks (also termed as the "codebook perturbation" 
throughout the paper) of the encoder and the decoder is memoryless, 

(iv) the codewords, which constitute the encoder codebook, are realizations of an i.i.d. (independent identically distributed)
process.²

Under these conditions, the main results of the paper are as follows: 

(i) We derive the maximum rate of error-free information transmission (per communication channel use), termed as the
"asymmetric codebook capacity" (cf. Theorem 3.1): it is shown to be the maximum of the mutual information between
the decoder's codeword and the communication channel output, where the maximization is carried out over the probability 
distribution of the encoder's codeword. 

(ii) We evaluate the asymmetric codebook capacity for a special case of interest (termed as "binary symmetric case"), where 
the encoder alphabet is binary, the codebook perturbation is a binary symmetric distribution, and the communication 
channel is a binary symmetric channel (BSC). 

We begin our developments by stating the notation utilized in the paper and providing a rigorous statement of the problem
formulation in Sec. II. In Section III, we state the main result of the paper: the proofs of the forward and converse statements are
given in Sections III-A and III-B, respectively; a closed-form expression of the asymmetric codebook capacity for the binary
symmetric case is presented in Sec. III-C. The paper ends with concluding remarks in Sec. IV.

² The detailed justification of this assumption is given at the beginning of Sec. III.



II. Notation and Problem Statement 

A. Notation 

Boldface letters denote vectors; regular letters with subscripts denote individual elements of vectors. Furthermore, capital
letters represent random variables and lowercase letters denote individual realizations of the corresponding random variable.
The vector $[a_1, a_2, \ldots, a_n]^T$ is compactly represented by $\mathbf{a}^n$. The abbreviations "i.i.d.", "p.m.f.", and "w.l.o.g." are shorthands
for the terms "independent identically distributed", "probability mass function", and "without loss of generality", respectively.
For a discrete random variable $X$, with the corresponding p.m.f. denoted by $p(x)$ (where the subscript $X$ is omitted for
simplicity, and should be evident from the context) defined on the alphabet $\mathcal{X}$, $H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x)$ denotes its
entropy.³ Similarly, given discrete random variables $X$ and $Y$, the quantities $H(X,Y)$, $H(X|Y)$, $I(X;Y)$ denote the joint
entropy of $X$ and $Y$, the conditional entropy of $X$ given $Y$, and the mutual information between $X$ and $Y$, respectively. As a
shorthand, the binary entropy function is denoted by $H(p) \triangleq -p \log p - (1-p) \log(1-p)$ for $p \in [0,1]$.
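
The two entropy definitions above can be sketched numerically as follows. This is only an illustrative helper, not part of the paper; the function names `entropy` and `binary_entropy` are assumptions, and logarithms are taken in base 2 as in the rest of the text.

```python
import math


def entropy(pmf):
    """Shannon entropy, in bits, of a p.m.f. given as a sequence of probabilities."""
    # Terms with p = 0 are skipped, following the convention 0 log 0 = 0.
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def binary_entropy(p):
    """Binary entropy function H(p), in bits, for p in [0, 1]."""
    return entropy([p, 1.0 - p])
```

For instance, `binary_entropy(0.5)` evaluates the entropy of a fair binary symbol, and `entropy([0.25] * 4)` that of a uniform four-letter alphabet.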

B. Problem Statement 

In this section, we state the precise definition of the proposed problem of reliable communications with asymmetric
codebooks. Such a communication system consists of two components: a discrete memoryless communication channel denoted
by $(\mathcal{X}, p(y|x), \mathcal{Y})$ (with single-letter input alphabet $\mathcal{X}$, single-letter output alphabet $\mathcal{Y}$, and single-letter transition probability
$p(y|x)$, cf. [2], p. 193) and a $(2^{nR}, n)$ asymmetric channel code⁴ (cf. Definition 2.1).

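Definition 2.1: A $(2^{nR}, n)$ asymmetric channel code, denoted by $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)$, consists of the following components:

(i) a message set, $\mathcal{W} = \{1, \ldots, 2^{nR}\}$,

(ii) an encoder codebook, $\mathcal{C}_X \in \mathcal{X}^{2^{nR} \times n}$, consisting of $2^{nR}$ codewords of length $n$, $\{\mathbf{x}^n(i)\}_{i=1}^{2^{nR}}$; the $j$-th element of each codeword
is denoted by $x_j(i)$, $1 \le i \le 2^{nR}$, $1 \le j \le n$,

(iii) an encoding function, $f : \mathcal{W} \to \mathcal{X}^n$, where $f(i) = \mathbf{x}^n(i)$,

(iv) a decoder codebook, $\mathcal{C}_U \in \mathcal{U}^{2^{nR} \times n}$, consisting of $2^{nR}$ codewords of length $n$, $\{\mathbf{u}^n(i)\}_{i=1}^{2^{nR}}$; the $j$-th element of each codeword is
denoted by $u_j(i)$, $1 \le i \le 2^{nR}$, $1 \le j \le n$, such that $\mathcal{C}_U$ is formed from $\mathcal{C}_X$ via a probabilistic mapping (i.e., statistical
perturbation) in a memoryless fashion, where

$$\Pr(\mathcal{C}_U | \mathcal{C}_X) = p\left(\mathbf{u}^n(1), \mathbf{u}^n(2), \ldots, \mathbf{u}^n(2^{nR}) \,\middle|\, \mathbf{x}^n(1), \mathbf{x}^n(2), \ldots, \mathbf{x}^n(2^{nR})\right) = \prod_{i=1}^{2^{nR}} p\left(\mathbf{u}^n(i) | \mathbf{x}^n(i)\right), \tag{1}$$

and for all $1 \le i \le 2^{nR}$,

$$p\left(\mathbf{u}^n(i) | \mathbf{x}^n(i)\right) = \prod_{j=1}^{n} p\left(u_j(i) | x_j(i)\right), \tag{2}$$

(v) a decoding function, $g : \mathcal{Y}^n \to \mathcal{W} \cup \{0\}$, which is a deterministic mapping that assigns a decision (including a "null",
denoted by 0) to every received sequence $\mathbf{y}^n \in \mathcal{Y}^n$ by utilizing $\mathcal{C}_U$; the decoder output is denoted by $\hat{W}$,

(vi) a receiver-side message-to-codebook mapping, $h : \mathcal{W} \to \mathcal{U}^n$, where $h(i) = \mathbf{u}^n(i)$ for all $1 \le i \le 2^{nR}$.

³ Unless otherwise stated, all the logarithms are base-2.

⁴ Throughout the paper, for the sake of convenience, we assume that $2^{nR} \in \mathbb{Z}^+$ for all $R \in \mathbb{R}^+ \cup \{0\}$ and for any $n \in \mathbb{Z}^+$.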
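
The memoryless construction of the decoder codebook in (1) and (2) can be illustrated with a short simulation. The sketch below is not from the paper: the function name `generate_codebooks`, the integer encoding of the alphabets, and the use of NumPy are assumptions, and the encoder symbols are drawn i.i.d. from a p.m.f. `px` only for the sake of having a concrete encoder codebook.

```python
import numpy as np


def generate_codebooks(px, p_u_given_x, n, num_codewords, rng):
    """Draw an encoder codebook C_X and a perturbed decoder codebook C_U.

    px            : length-|X| p.m.f. of the encoder symbols
    p_u_given_x   : |X| x |U| row-stochastic matrix, entry [x, u] = p(u|x)
    n             : codeword length
    num_codewords : number of messages (2^{nR} in the text)
    rng           : a numpy.random.Generator
    """
    px = np.asarray(px, dtype=float)
    p_u_given_x = np.asarray(p_u_given_x, dtype=float)
    # Encoder codebook: one row per message, every entry drawn from p(x).
    c_x = rng.choice(len(px), size=(num_codewords, n), p=px)
    # Decoder codebook: u_j(i) is drawn from p(u | x_j(i)), independently for
    # every entry, which is the product form of (1) and (2). Sampling uses the
    # inverse-CDF method on the row of p(u|x) selected by each encoder symbol.
    cdf = np.cumsum(p_u_given_x, axis=1)
    uniform = rng.random(size=c_x.shape)
    c_u = (uniform[..., None] >= cdf[c_x]).sum(axis=-1)
    # Guard against a cumulative sum that ends marginally below 1.
    c_u = np.minimum(c_u, p_u_given_x.shape[1] - 1)
    return c_x, c_u
```

Row `i` of `c_x` plays the role of $\mathbf{x}^n(i+1)$ and row `i` of `c_u` that of $\mathbf{u}^n(i+1)$, so the shared row index is what the mappings $f(\cdot)$ and $h(\cdot)$ encode.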

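Next, we state several "error-event-related" definitions, which will be used throughout the paper. Note that, in all these events,
we condition on the particular realization of the decoder codebook $\mathcal{C}_U$ and explicitly state this in the notation.

• Conditional probability of error, $\lambda_i$, conditioned on the transmitted message $i$ and the decoder codebook $\mathcal{C}_U$:

$$\lambda_i(\mathcal{C}_U) = \Pr\left(g(\mathbf{Y}^n) \ne i \mid h(i) = \mathbf{u}^n(i)\right) = \sum_{\mathbf{y}^n} p\left(\mathbf{y}^n | \mathbf{u}^n(i)\right) \mathbb{1}_{(g(\mathbf{y}^n) \ne i)}, \tag{3}$$

where $\mathbb{1}_{(\cdot)}$ is the standard indicator function.

• Maximal probability of error, $\lambda^{(n)}$, conditioned on the decoder codebook $\mathcal{C}_U$:

$$\lambda^{(n)}(\mathcal{C}_U) = \max_{i \in \mathcal{W}} \lambda_i. \tag{4}$$

• Average probability of error, $P_e^{(n)}$, conditioned on $\mathcal{C}_U$:

$$P_e^{(n)}(\mathcal{C}_U) \triangleq \Pr\left(W \ne g(\mathbf{Y}^n)\right). \tag{5}$$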
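The block diagram representation of the asymmetric communication system defined above is shown in Fig. 1 below.

[Figure 1: the message $W \in \mathcal{W}$ is mapped by the encoder $f(\cdot)$, which uses $\mathcal{C}_X = [\mathbf{x}^n(w)]$, to the codeword $\mathbf{x}^n(W)$; the codeword is sent over the channel $p(y|x)$; the decoder $g(\cdot)$, which uses $\mathcal{C}_U = [\mathbf{u}^n(w)]$, outputs $\hat{W} \in \mathcal{W} \cup \{0\}$. The codebooks are related through $p(\mathbf{u}^n) = \sum_{\mathbf{x}^n} p(\mathbf{x}^n) p(\mathbf{u}^n | \mathbf{x}^n)$, such that $p(\mathbf{u}^n | \mathbf{x}^n) = \prod_{i=1}^{n} p(u_i | x_i)$.]

Fig. 1. The Block Diagram Representation of the Discrete Memoryless Channel with Asymmetric Codebooks.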

Remark 2.1: 

(i) The quantities $\mathcal{X}$, $\mathcal{U}$, $\mathcal{Y}$ represent the alphabets from which the individual elements of the encoder codewords, the decoder
codewords and the decoder inputs are drawn, respectively.

(ii) The p.m.f. $p(y|x)$ represents the (discrete memoryless) communication channel between the transmitter and the receiver, over
which the transmission of information is carried out.

(iii) The p.m.f. $p(u|x)$ represents the perturbation distribution between the codewords of the encoder ($\{\mathbf{x}^n(i)\}_i$) and the decoder
($\{\mathbf{u}^n(i)\}_i$), which statistically characterizes the asymmetric nature of the codebooks of these components ($\mathcal{C}_X$ and $\mathcal{C}_U$).

(iv) Due to the physical nature of the problem, the generation of the receiver input and the generation of the receiver codebook,
given the encoder codebook, are two separate independent events; in general, they do not necessarily need to happen at the
same time, which highlights a fundamental difference between our setup and a "broadcast-channel-like" setup. Hence, given
the random variables $X \in \mathcal{X}$, $Y \in \mathcal{Y}$, $U \in \mathcal{U}$, obeying the conditional p.m.f.s $p(y|x)$ and $p(u|x)$, we have

$$p(y, u | x) = p(y|x) p(u|x), \tag{6}$$

which implies that $U \leftrightarrow X \leftrightarrow Y$ forms a Markov chain in the stated order.

(v) Both the encoder and the decoder possess the knowledge of $p(y|x)$, $p(u|x)$ and $p(x)$; on the other hand, the particular
codebook realizations, $\mathcal{C}_X$ and $\mathcal{C}_U$, are known only by the encoder and the decoder, respectively.

(vi) The functions $f(\cdot)$ and $h(\cdot)$ effectively determine the "ordering" of the rows of the codebook matrices $\mathcal{C}_X$ and $\mathcal{C}_U$, respectively
(w.l.o.g., the elements of the message set $\mathcal{W} = \{1, 2, \ldots, 2^{nR}\}$ are thought to be ordered in an increasing fashion at both
sides). Since the encoder and the decoder only know the particular codebook realizations $\mathcal{C}_X$ and $\mathcal{C}_U$, respectively, equivalently
they also only know the functions $f(\cdot)$ and $h(\cdot)$, respectively. Thus, the asymmetric nature of the problem arises from the
mismatch between $f(\cdot)$ and $h(\cdot)$.

(vii) In this paper, the statistical perturbation that models the mapping from $\mathcal{C}_X$ to $\mathcal{C}_U$ is assumed to be memoryless for the sake of
simplicity, which should be thought of as a first step towards analyzing reliable communications with asymmetric
codebooks. In the more general case, the mapping from $\mathcal{C}_X$ to $\mathcal{C}_U$ can be arbitrary, which constitutes part of our future research.

III. Discrete-Memoryless Channel With Asymmetric Codebooks, I.I.D. Case 

In this section, we proceed with analyzing the communication system presented in Section II-B under the following
assumption: the codewords, $\{\mathbf{x}^n(w)\}_{w=1}^{2^{nR}}$, of the encoder's codebook, $\mathcal{C}_X$, are realizations of an i.i.d. random process with
some marginal distribution $p(x)$. The main reason for this assumption is the fact that, for the case of dependent codewords
of $\mathcal{C}_X$, we have dependency between the pairs $\{(u_i(w), Y_i)\}$ for all $w \in \mathcal{W}$ and $1 \le i \le n$ (cf. Lemma 3.1), which
necessarily implies the existence of memory in the overall communication system.⁵ Since we treat the current paper as a first
step towards achieving the goal of investigating the problem of "reliable communications with asymmetric codebooks", we
currently confine ourselves to the setup of i.i.d. encoder codewords for the sake of simplicity; analyzing more general cases,
which include both asymmetric codebooks and memory, constitutes part of our future research.

Under the aforementioned assumptions of i.i.d. encoder codewords, memoryless perturbation distribution, and memoryless
communication channel, we call the resulting system the discrete memoryless communication channel with i.i.d. asymmetric
codebooks.⁶ Our fundamental result is Theorem 3.1, where we state the channel coding result for the resulting system.
Section III-A contains the achievability result, while Section III-B is devoted to the converse of Theorem 3.1. Section III concludes
with the evaluation of the capacity for a special case, where the encoder codewords are binary, the communication channel
is a binary symmetric channel and the perturbation distribution is a binary symmetric distribution, which is the topic of
Section III-C.

⁵ Note that as far as decoding is concerned, the overall setup may intuitively be thought to be analogous to a communication system where both the encoder
and the decoder share the same codebook $\mathcal{C}_U$, and the communication channel via which the transmission of information is carried out is represented by the
conditional p.m.f. $p(\mathbf{y}^n | \mathbf{u}^n)$. Note that the overall setup is not equivalent to the aforementioned communication system, since the design parameter in the
setup of interest is $p(x)$. This issue will be further discussed in Remark 3.5.

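Definition 3.1: An i.i.d. $(2^{nR}, n)$ asymmetric channel code for the discrete memoryless communications channel $(\mathcal{X}, p(y|x), \mathcal{Y})$
consists of the six components mentioned in Definition 2.1, with the following additional property on the codewords of $\mathcal{C}_X$:
the elements $x_i(w)$ are i.i.d. realizations of a random variable $X$ with p.m.f. $p(x)$,
for all $w \in \mathcal{W}$, $1 \le i \le n$.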

Now, we define the achievable rate and the operational capacity of the discrete memoryless channel with i.i.d. asymmetric 
codebooks. 

Definition 3.2: A rate $R$ is said to be achievable provided that there exists a sequence of i.i.d. $(2^{nR}, n)$ asymmetric channel
codes, such that the corresponding maximal probability of error (cf. (4)) $\lambda^{(n)}(\mathcal{C}_U) \to 0$ as $n \to \infty$.

Definition 3.3: For any given discrete memoryless channel $(\mathcal{X}, p(y|x), \mathcal{Y})$, the operational capacity of the discrete memoryless
channel with asymmetric codebooks is defined as the supremum of the achievable rates.

Next, we define the information capacity of the discrete memoryless channel with i.i.d. asymmetric codebooks, which will 
be shown to be equal to the operational capacity of the system. 

Definition 3.4: For any given discrete memoryless communications channel $(\mathcal{X}, p(y|x), \mathcal{Y})$, the information capacity of the
discrete memoryless channel with i.i.d. asymmetric codebooks is defined as

$$C = \max_{p(x)} I(U;Y), \tag{7}$$

where $p(u, y) = \sum_{x \in \mathcal{X}} p(x) p(y|x) p(u|x)$ (cf. (6)).
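
As a numerical illustration of Definition 3.4, the sketch below evaluates $I(U;Y)$ from $p(x)$, $p(y|x)$ and $p(u|x)$, and approximates the maximization in (7) by a plain grid search for a binary encoder alphabet. It is not the paper's derivation: the function names, the helper `bsc`, the grid resolution and the use of NumPy are all assumptions made for illustration, and the grid search only approximates the maximum.

```python
import numpy as np


def mutual_information_uy(px, p_y_given_x, p_u_given_x):
    """I(U;Y) in bits for the joint p(u,y) = sum_x p(x) p(y|x) p(u|x)."""
    px = np.asarray(px, dtype=float)
    p_y_given_x = np.asarray(p_y_given_x, dtype=float)  # shape |X| x |Y|
    p_u_given_x = np.asarray(p_u_given_x, dtype=float)  # shape |X| x |U|
    # Joint p.m.f. of (U, Y), obtained by averaging over x.
    p_uy = np.einsum("x,xu,xy->uy", px, p_u_given_x, p_y_given_x)
    p_u = p_uy.sum(axis=1, keepdims=True)
    p_y = p_uy.sum(axis=0, keepdims=True)
    mask = p_uy > 0  # zero-probability pairs contribute nothing
    ratio = p_uy[mask] / (p_u @ p_y)[mask]
    return float(np.sum(p_uy[mask] * np.log2(ratio)))


def capacity_binary_input(p_y_given_x, p_u_given_x, grid_size=1001):
    """Grid-search approximation of max over p(x) of I(U;Y) for binary X.

    Returns the best rate found and the corresponding Pr(X = 1).
    """
    best_rate, best_a = -1.0, None
    for a in np.linspace(0.0, 1.0, grid_size):
        rate = mutual_information_uy([1.0 - a, a], p_y_given_x, p_u_given_x)
        if rate > best_rate:
            best_rate, best_a = rate, a
    return best_rate, best_a


def bsc(crossover):
    """Transition matrix of a binary symmetric channel (illustrative helper)."""
    return np.array([[1.0 - crossover, crossover],
                     [crossover, 1.0 - crossover]])
```

Calling `capacity_binary_input(bsc(0.1), np.eye(2))` corresponds to the case with no codebook perturbation, whereas `capacity_binary_input(bsc(0.1), bsc(0.05))` uses a binary symmetric perturbation; by the data processing inequality the second value cannot exceed the first.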



Remark 3.1: We note two observations. First, $I(U;Y) \ge 0$ is upper-bounded by some finite value, since both $\mathcal{U}$ and $\mathcal{Y}$
are discrete finite sets. Second, the set of all probability vectors $p(x)$ is a closed and bounded subset of $[0,1]^{|\mathcal{X}|}$.
Combining these observations with the continuity of $I(U;Y)$ in $p(x)$, we deduce that the maximum in (7) exists.

Remark 3.2: Note that the quantity $I(U;Y)$, the argument of (7), is also the argument of the maximization problem
corresponding to the classical channel capacity expression

$$\max_{p(u)} I(U;Y), \tag{8}$$

where $U$ and $Y$ are the communication channel input and the corresponding channel output, respectively. While the arguments
of (7) and (8) are the same, the optimization parameter is different ($p(x)$ and $p(u)$ in the former and latter, respectively).
Thus, relating our problem to the classical channel capacity problem, the result (7) is quite intuitive: the argument $I(U;Y)$
represents the amount of information that can be transmitted through the channel, since $U$ and $Y$ denote the decoder's codebook
random variable and the communication channel output random variable, respectively; on the other hand, the maximization is
carried out over $p(x)$, the distribution of the encoder codebook random variable, which is the only design parameter.

⁶ We show in Lemma 3.2 that, under the specified assumptions, the elements of the decoder codebook are also i.i.d.

Theorem 3.1: (Channel Coding Theorem With Asymmetric Codebooks) For a discrete memoryless channel with i.i.d.
asymmetric codebooks, all rates below capacity $C$ are achievable. Specifically, for every rate $R < C$, there exists a sequence
of $(2^{nR}, n)$ i.i.d. asymmetric channel codes whose maximum probability of error can be made arbitrarily small for sufficiently
large $n$.

Conversely, any sequence of $(2^{nR}, n)$ i.i.d. asymmetric codes with asymptotically vanishing maximum error probability
must necessarily satisfy $R \le C$.

Remark 3.3: Using the chain rule for mutual information, we have

\begin{align}
I(U,X;Y) &= I(U;Y) + I(X;Y|U), \tag{9} \\
&= I(X;Y) + I(U;Y|X). \tag{10}
\end{align}

Combining (9), (10) and noting that $I(U;Y|X) = 0$ (cf. (6)), we get

$$I(U;Y) = I(X;Y) - I(X;Y|U). \tag{11}$$

Recall that, if there is no perturbation between the encoder and decoder codebooks (i.e., $\mathcal{C}_U = \mathcal{C}_X$), the proposed system reduces
to the conventional channel coding setup, in which case (given $p(x)$) the achievable rate is $I(X;Y)$. Hence, inspecting (11),
we observe that the term $I(X;Y|U) \ge 0$ can be viewed as the achievable rate loss due to the asymmetric codebooks.

Furthermore, $I(X;Y|U) = 0$ if and only if $p(x,y|u) = p(x|u) p(y|u)$, i.e., $X \leftrightarrow U \leftrightarrow Y$ forms a Markov chain. Since we
also have $U \leftrightarrow X \leftrightarrow Y$ (cf. (6)), this is possible if and only if there exists a one-to-one mapping between $U$ and $X$.
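
The decomposition (11) can be checked numerically for any concrete choice of $p(x)$, $p(y|x)$ and $p(u|x)$. The following is a minimal sketch under the Markov structure of (6); the function names and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np


def _mi(p_ab):
    """Mutual information in bits of a 2-D joint p.m.f. (rows: A, columns: B)."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log2(p_ab[mask] / (p_a @ p_b)[mask])))


def rate_decomposition(px, p_y_given_x, p_u_given_x):
    """Return (I(U;Y), I(X;Y), I(X;Y|U)) for the Markov chain U - X - Y."""
    px = np.asarray(px, dtype=float)
    p_y_given_x = np.asarray(p_y_given_x, dtype=float)
    p_u_given_x = np.asarray(p_u_given_x, dtype=float)
    # Joint p(x,u,y) = p(x) p(u|x) p(y|x), i.e. the factorization in (6).
    p_xuy = np.einsum("x,xu,xy->xuy", px, p_u_given_x, p_y_given_x)
    i_uy = _mi(p_xuy.sum(axis=0))  # marginalize x
    i_xy = _mi(p_xuy.sum(axis=1))  # marginalize u
    # I(X;Y|U) = sum_u p(u) I(X;Y | U = u).
    p_u = p_xuy.sum(axis=(0, 2))
    i_xy_given_u = sum(
        p_u[u] * _mi(p_xuy[:, u, :] / p_u[u])
        for u in range(len(p_u)) if p_u[u] > 0
    )
    return i_uy, i_xy, float(i_xy_given_u)
```

The three returned values should satisfy `i_uy == i_xy - i_xy_given_u` up to floating-point error, and the last value, the rate loss term of (11), vanishes when `p_u_given_x` is a permutation matrix.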

Next, we state the following lemma, which states that in case of i.i.d. encoder codewords, the pairs consisting of the individual
elements of the decoder codewords and the corresponding communication channel outputs are i.i.d. This result shall be used
in proving the achievability and converse theorems.

Lemma 3.1: Given $p(\mathbf{x}^n) = \prod_{i=1}^{n} p(x_i)$, a discrete memoryless communications channel $(\mathcal{X}, p(y|x), \mathcal{Y})$, and a $(2^{nR}, n)$
asymmetric channel code $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)$, we have

$$p(\mathbf{y}^n, \mathbf{u}^n) = \prod_{i=1}^{n} p(y_i, u_i), \tag{12}$$

where $p(y, u) = \sum_{x \in \mathcal{X}} p(x) p(y|x) p(u|x)$.

Proof: See Appendix I. ■

A. Achievability 

Theorem 3.2: (Achievability) For every rate $R < C$, there exists a sequence of $(2^{nR}, n)$ i.i.d. asymmetric codes with
arbitrarily small maximum probability of error for sufficiently large $n$.

Proof: The proof relies on random coding arguments. First, we state the achievable rate for any given $p(x)$.
Encoding:

(i) Generation of Codebooks: Fix $p(x)$ and reveal it to both sides. Generate the encoder codebook $\mathcal{C}_X$ as stated in Definition 2.1,
part (ii), such that $x_i(w)$ are i.i.d. realizations of $X$ with distribution $p(x)$ for all $i \in \{1, \ldots, n\}$, $w \in \mathcal{W}$. Construct
the decoder's codebook $\mathcal{C}_U$, as stated in Definition 2.1, part (iv), using the conditional p.m.f. $p(u|x)$.

(ii) Choose a message $w$ uniformly from $\mathcal{W}$ (i.e., $\Pr(W = w) = 2^{-nR}$ for all $w \in \mathcal{W}$). Then $f(w) = \mathbf{x}^n(w)$ is transmitted
over the communication channel $p(y|x)$, resulting in $\mathbf{Y}^n$, such that $\Pr(\mathbf{Y}^n = \mathbf{y}^n | \mathbf{x}^n(w)) = \prod_{i=1}^{n} p(y_i | x_i(w))$ (recall
the memoryless property of the communication channel).

Decoding: 

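(i) Note that the pairs $\{(u_i(W), Y_i)\}_{i=1}^{n}$ are independent of each other (cf. Lemma 3.1), where $\mathbf{Y}^n$ is the communication channel
output corresponding to the message $W \in \mathcal{W}$. Next, we use jointly typical decoding: decide the unique $\hat{W} \in \mathcal{W}$ (if it
exists) such that $(\mathbf{u}^n(\hat{W}), \mathbf{Y}^n) \in A_\epsilon^{(n)}(U,Y)$, where $A_\epsilon^{(n)}(U,Y)$ (from now on denoted by $A_\epsilon^{(n)}$ for the sake of
simplicity) is the $\epsilon$-jointly-typical set [2], defined on $p(u,y) = \sum_{x \in \mathcal{X}} p(x) p(y|x) p(u|x)$:

$$A_\epsilon^{(n)} = \left\{ (\mathbf{u}^n, \mathbf{y}^n) : \left| -\tfrac{1}{n} \log p(\mathbf{u}^n) - H(U) \right| < \epsilon, \ \left| -\tfrac{1}{n} \log p(\mathbf{y}^n) - H(Y) \right| < \epsilon, \ \left| -\tfrac{1}{n} \log p(\mathbf{u}^n, \mathbf{y}^n) - H(U,Y) \right| < \epsilon \right\}. \tag{13}$$

If such a $\hat{W} \in \mathcal{W}$ is not unique or does not exist, then declare $g(\mathbf{Y}^n) = 0$. The error event is defined as

$$\mathcal{E} \triangleq \{\hat{W} \ne W\}. \tag{14}$$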
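
The decoding rule above can be sketched in code as follows. This is an illustrative sketch only: the function names, the integer encoding of symbols, and the brute-force scan over the decoder codebook are assumptions, and sequences are assumed to be NumPy integer arrays. The joint p.m.f. `p_uy` is indexed as `p_uy[u, y]`, and messages are numbered from 1 so that 0 can serve as the null decision.

```python
import numpy as np


def is_jointly_typical(u_seq, y_seq, p_uy, eps):
    """Check the three conditions of (13) for one (u^n, y^n) pair."""
    p_uy = np.asarray(p_uy, dtype=float)
    p_u, p_y = p_uy.sum(axis=1), p_uy.sum(axis=0)

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    with np.errstate(divide="ignore"):
        # Per-symbol negative log-likelihoods; a zero-probability symbol
        # gives +inf, which then fails the corresponding condition.
        s_u = -np.mean(np.log2(p_u[u_seq]))
        s_y = -np.mean(np.log2(p_y[y_seq]))
        s_uy = -np.mean(np.log2(p_uy[u_seq, y_seq]))
    return bool(abs(s_u - entropy(p_u)) < eps
                and abs(s_y - entropy(p_y)) < eps
                and abs(s_uy - entropy(p_uy.ravel())) < eps)


def typicality_decode(c_u, y_seq, p_uy, eps):
    """Return the unique message whose decoder codeword is jointly typical
    with y^n, or 0 (the null decision) if there is none or more than one."""
    hits = [w for w, u_seq in enumerate(c_u, start=1)
            if is_jointly_typical(u_seq, y_seq, p_uy, eps)]
    return hits[0] if len(hits) == 1 else 0
```

Note that the decoder only touches `c_u` and `p_uy`; the encoder codebook never appears, which mirrors the fact that $\mathcal{C}_X$ is not available at the receiver.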



Analysis of the Probability of Error:

Observe that, using the uniform distribution of $W$ over $\mathcal{W}$, we have

$$P_e^{(n)}(\mathcal{C}_U) = \sum_{i=1}^{2^{nR}} \Pr\left(g(\mathbf{Y}^n) \ne i \mid h(i) = \mathbf{u}^n(i)\right) \Pr(W = i) = \frac{1}{2^{nR}} \sum_{i=1}^{2^{nR}} \lambda_i. \tag{15}$$

Next, we show that the elements of the codewords of the decoder codebook are i.i.d.

Lemma 3.2:

$$\Pr(\mathcal{C}_U) = \prod_{i=1}^{n} \prod_{w=1}^{2^{nR}} p(u_i(w)), \tag{16}$$

where $p(u) = \sum_{x} p(x) p(u|x)$.

Proof: See Appendix II. ■
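Recalling the definitions (14) and (5), and using (15), we have the following average probability of error, averaged over all
possible decoder codebooks⁷:

\begin{align}
P_e^{(n)} &= \Pr(\mathcal{E}), \notag \\
&= \sum_{\mathcal{C}_U} \Pr(\mathcal{C}_U) P_e^{(n)}(\mathcal{C}_U), \notag \\
&= \frac{1}{2^{nR}} \sum_{w=1}^{2^{nR}} \sum_{\mathcal{C}_U} \Pr(\mathcal{C}_U) \lambda_w(\mathcal{C}_U), \notag \\
&= \sum_{\mathcal{C}_U} \Pr(\mathcal{C}_U) \lambda_1(\mathcal{C}_U), \tag{17} \\
&= \Pr(\mathcal{E} | W = 1), \tag{18}
\end{align}

where (17) follows since the decoder's codebook generation is symmetric per Lemma 3.2. Thus, w.l.o.g., from now on we
confine ourselves to the case of $W = 1$.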
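Next, we define the following events of joint typicality of $(\mathbf{u}^n(i), \mathbf{Y}^n)$ in case of $W = 1$:

$$\mathcal{E}_i \triangleq \left\{ (\mathbf{u}^n(i), \mathbf{Y}^n) \in A_\epsilon^{(n)} \mid W = 1 \right\}, \tag{19}$$

for $i \in \{1, \ldots, 2^{nR}\}$.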
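As a result, we have

\begin{align}
P_e^{(n)} &= \Pr(\mathcal{E} | W = 1), \tag{20} \\
&= \Pr\left( \mathcal{E}_1^c \cup \bigcup_{j=2}^{2^{nR}} \mathcal{E}_j \,\middle|\, W = 1 \right), \tag{21} \\
&\le \Pr(\mathcal{E}_1^c | W = 1) + \sum_{j=2}^{2^{nR}} \Pr(\mathcal{E}_j | W = 1), \tag{22}
\end{align}

where (20) follows from (18), (21) follows from the definition (19), and (22) follows from the standard union bound. Next, we
provide upper bounds for the terms constituting the right hand side of (22):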

\begin{align}
\Pr(\mathcal{E}_1^c | W = 1) &\le \epsilon, \tag{23} \\
\Pr(\mathcal{E}_j | W = 1) &\le 2^{-n(I(U;Y) - 3\epsilon)}, \quad \text{for any } j \in \{2, 3, \ldots, 2^{nR}\}, \tag{24}
\end{align}

for any $\epsilon > 0$ and sufficiently large $n$; here (23) and (24) follow from the joint AEP theorem (cf. Theorem 7.6.1 of [2]), since
$\mathbf{u}^n(i)$ and $\mathbf{u}^n(1)$ are independent for $i \ne 1$ (cf. Lemma 3.2). Using (23) and (24) in (22) yields

$$P_e^{(n)} \le \epsilon + 2^{-n(I(U;Y) - R - 3\epsilon)}, \tag{25}$$

for any $\epsilon > 0$ and sufficiently large $n$. Note that (25) implies that if $I(U;Y) - R > 3\epsilon$, we have

$$P_e^{(n)} \le 2\epsilon, \tag{26}$$

for sufficiently large $n$. Thus, for any rate $R < I(U;Y)$, there exists a sufficiently small $\epsilon$ and a sufficiently large $n$, such
that (26) holds. Next, choose $p(x)$ in the encoding step so as to maximize $I(U;Y)$; let $p^*(x)$ be a maximizer. Then the
condition $R < I(U;Y)$ can be replaced by the achievability condition $R < C$. Next, following similar steps to those used in
the achievability proof of the classical channel coding theorem (cf. [2], p. 204), we conclude that we can construct a code of
rate $R - 1/n$ with maximal probability of error $\lambda^{(n)} \le 4\epsilon$ for any $\epsilon > 0$ for sufficiently large $n$. ■

⁷ Recall that since the marginal probability of $\mathcal{C}_U$ is the result of averaging the joint distribution of $\mathcal{C}_X$ and $\mathcal{C}_U$ over $\mathcal{C}_X$ (cf. Lemma 3.2), we equivalently
average out the conditional probability of error expression over the two codebooks of the system, since the probability space of $\mathcal{C}_U$ is jointly induced by the
perturbation distribution and the probability space of $X$. This point constitutes the fundamental difference between the achievability proof for our system and
the classical achievability proof of channel coding.
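
For readers who want the intermediate algebra, the passage from the union bound (22) to (25) can be written out as below. This is a routine expansion using only (23), (24) and the trivial bound $2^{nR} - 1 \le 2^{nR}$; it is offered as a sketch of the omitted step rather than as a quotation of the original derivation.

```latex
\begin{align*}
P_e^{(n)}
  &\le \Pr(\mathcal{E}_1^c \mid W = 1)
     + \sum_{j=2}^{2^{nR}} \Pr(\mathcal{E}_j \mid W = 1) \\
  &\le \epsilon + \left(2^{nR} - 1\right) 2^{-n\left(I(U;Y) - 3\epsilon\right)} \\
  &\le \epsilon + 2^{nR}\, 2^{-n\left(I(U;Y) - 3\epsilon\right)} \\
  &=   \epsilon + 2^{-n\left(I(U;Y) - R - 3\epsilon\right)} .
\end{align*}
```

When $I(U;Y) - R > 3\epsilon$, the exponent is negative, so the second term falls below $\epsilon$ for sufficiently large $n$, which gives (26).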

Remark 3.4: Note that, since the decoder knows the marginal distribution of the encoder's codewords, it "typically" knows
(due to the AEP) the codewords of the encoder as a "cluster". Furthermore, due to the availability of the statistical characterization
of the communication channel to both sides, the decoder also knows the jointly typical $(\mathbf{x}^n, \mathbf{y}^n)$ pairs, again as
a "cluster". However, since the decoder does not know the precise "ordering"⁸ of the encoder's codebook (which uniquely
determines $f(\cdot)$), the clustering information by itself does not yield anything useful as far as decoding is concerned: the
decoder may very well find out the particular codeword $\mathbf{x}^n$ which is jointly typical with the received $\mathbf{y}^n$; however, in the
absence of the encoding function $f(\cdot)$, it cannot calculate $f^{-1}(\mathbf{x}^n)$ in order to give an estimate of the transmitted message.
Hence, the only tool the decoder may use in order to perform detection is the codebook, $\mathcal{C}_U$, made available via the usage of the
perturbation distribution, and the resulting function $h(\cdot)$.

B. Converse 



In this section, we provide the converse of Theorem 3.1, which is stated below:
Theorem 3.3: (Converse) For any $(2^{nR}, n)$ i.i.d. asymmetric codes with $\lambda^{(n)} \to 0$, we have $R \le C$.
Proof: The proof consists of three steps:

Step 1: We first show that, for $(2^{nR}, n)$ asymmetric codes with $\lambda^{(n)} \to 0$, we necessarily need to have that $h(\cdot)$ is one-to-one
(cf. Lemma 3.3).

⁸ Recall that the encoder codebook, $\mathcal{C}_X$, is not available at the decoder.

Step 2: Then, we show that, for $(2^{nR}, n)$ i.i.d. asymmetric codes with $P_e^{(n)} \to 0$ and $h(\cdot)$ one-to-one, we necessarily have
$R \le C$.

Step 3: Next, we note that every $(2^{nR}, n)$ i.i.d. asymmetric code with $\lambda^{(n)} \to 0$ should also necessarily satisfy $P_e^{(n)} \to 0$.
Since such codes also must satisfy the property of $h(\cdot)$ being one-to-one (per Step 1 above, by recalling that the set of i.i.d.
asymmetric codes is a subset of asymmetric codes), these codes satisfy the conditions specified in Step 2. Thus, proving the
statement of Step 2 above constitutes a sufficient condition for the converse theorem.
We proceed with proving the following lemma, which establishes Step 1 above.

Lemma 3.3: For any $(2^{nR}, n)$ asymmetric code with $\lambda^{(n)} \to 0$, $h(\cdot)$ is necessarily a one-to-one mapping.

Proof: See Appendix III. ■
Next, we continue with completing the second step of the proof: we show that, for $(2^{nR}, n)$ i.i.d. asymmetric codes with
$P_e^{(n)} \to 0$ and $h(\cdot)$ one-to-one, we necessarily have $R \le C$.

First note that the transmitted message, $W$, and the communication channel output, $\mathbf{Y}^n$, have a joint distribution, and that
the decoder output, $\hat{W}$, is a function of the communication channel output, $\mathbf{Y}^n$. Hence, we conclude that $W \leftrightarrow \mathbf{Y}^n \leftrightarrow \hat{W}$
forms a Markov chain in the specified order. Furthermore, since $h(W) = \mathbf{u}^n(W)$ is a function of $W$, we also see that
$\mathbf{u}^n(W) \leftrightarrow W \leftrightarrow \mathbf{Y}^n$ forms a Markov chain in the specified order. Combining these two observations, we conclude that
$\mathbf{u}^n(W) \leftrightarrow W \leftrightarrow \mathbf{Y}^n \leftrightarrow \hat{W}$ forms a Markov chain in the specified order. Next, since $h(\cdot)$ is one-to-one per assumption, the
previous Markov chain further implies that $W \leftrightarrow \mathbf{U}^n \leftrightarrow \mathbf{Y}^n \leftrightarrow \hat{W}$ forms a Markov chain in the specified order, where $\mathbf{U}^n$
denotes $\mathbf{u}^n(W)$ for the sake of simplicity.

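Now, we continue with investigating the pair $\mathbf{U}^n$, $\mathbf{Y}^n$ in the following chain of equalities:

\begin{align}
p(\mathbf{y}^n | \mathbf{u}^n) &= \frac{p(\mathbf{y}^n, \mathbf{u}^n)}{p(\mathbf{u}^n)}, \tag{27} \\
&= \frac{\prod_{i=1}^{n} p(y_i, u_i)}{p(\mathbf{u}^n)}, \tag{28} \\
&= \prod_{i=1}^{n} p(y_i | u_i), \tag{29}
\end{align}

where (27) follows from the definition of conditional probability, (28) follows from (12) and the fact that $p(\mathbf{x}^n) = \prod_{i=1}^{n} p(x_i)$, and (29) follows from recalling that
$p(y, u) = \sum_{x} p(y|x) p(u|x) p(x)$, using the product form of $p(\mathbf{u}^n)$ (cf. Lemma 3.2), and noting that $p(y_i | u_i) = p(y_i, u_i) / p(u_i)$ by Bayes' rule. Equipped with the memoryless
property of $p(\mathbf{y}^n | \mathbf{u}^n)$ (cf. (29)), we continue with the following chain of inequalities: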

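\begin{align}
nR &= H(W), \tag{30} \\
&= I(W; \hat{W}) + H(W | \hat{W}), \notag \\
&\le I(\mathbf{U}^n; \mathbf{Y}^n) + \left(1 + nR P_e^{(n)}\right), \tag{31} \\
&= H(\mathbf{Y}^n) - \sum_{i=1}^{n} H(Y_i | U_i) + \left(1 + nR P_e^{(n)}\right), \tag{32} \\
&= \left(1 + nR P_e^{(n)}\right) + \sum_{i=1}^{n} H(Y_i) - \sum_{i=1}^{n} H(Y_i | U_i), \tag{33} \\
&= \left(1 + nR P_e^{(n)}\right) + \sum_{i=1}^{n} I(U_i; Y_i), \tag{34}
\end{align}

where (30) follows since $W$ is uniformly distributed over $\mathcal{W}$, (31) follows using Fano's inequality (the second term) and
the data processing inequality [2] (the first term) by recalling that $W \leftrightarrow \mathbf{U}^n \leftrightarrow \mathbf{Y}^n \leftrightarrow \hat{W}$ forms a Markov chain in the
specified order, (32) follows using (29), (33) follows since $p(\mathbf{y}^n) = \prod_{i=1}^{n} \sum_{x_i} p(y_i | x_i) p(x_i) = \prod_{i=1}^{n} p(y_i)$ (because of the
fact that the communication channel is memoryless and $p(\mathbf{x}^n) = \prod_{i=1}^{n} p(x_i)$⁹), which implies that $H(\mathbf{Y}^n) = \sum_{i=1}^{n} H(Y_i)$, and
(34) follows using the definition of mutual information.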

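Next, we aim to upper bound the mutual information terms in (34). We proceed with noting that

\begin{align}
I(U; Y) &= \sum_{u,y} p(u, y) \log \frac{p(u, y)}{p(u) p(y)}, \notag \\
&= \sum_{u,y} \left[ \sum_{x} p(x) p(y|x) p(u|x) \right] \log \frac{\sum_{x} p(x) p(y|x) p(u|x)}{\left[ \sum_{x} p(x) p(u|x) \right] \left[ \sum_{x} p(x) p(y|x) \right]}, \tag{35}
\end{align}

where (35) follows using $p(y, u) = \sum_{x} p(y|x) p(u|x) p(x)$. Note that the only variable that can be "adjusted" in (35) is $p(x)$ (the variables $p(y|x)$ and $p(u|x)$ are given and fixed); hence, we conclude that

$$I(U_i; Y_i) \le \max_{p(x)} I(U; Y) = C, \quad \text{for all } (U_i, Y_i) \text{ pairs.} \tag{36}$$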
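As a numerical companion to (35) and (36), the following minimal Python sketch evaluates $I(U;Y)$ from $p(x)$, $p(y|x)$ and $p(u|x)$ for finite alphabets. The function name `mutual_information_uy` and the example transition matrices are illustrative assumptions and do not come from the paper; only $p(x)$ is varied, mirroring the observation that it is the sole adjustable quantity.

```python
import math


def mutual_information_uy(px, py_x, pu_x):
    """Return I(U;Y) in bits, computed as in (35).

    px[x]      : input distribution p(x)
    py_x[x][y] : communication channel p(y|x)
    pu_x[x][u] : perturbation distribution p(u|x)
    """
    nx, ny, nu = len(px), len(py_x[0]), len(pu_x[0])
    # Joint p(u, y) = sum_x p(x) p(y|x) p(u|x).
    puy = [[sum(px[x] * py_x[x][y] * pu_x[x][u] for x in range(nx))
            for y in range(ny)] for u in range(nu)]
    pu = [sum(row) for row in puy]
    py = [sum(puy[u][y] for u in range(nu)) for y in range(ny)]
    total = 0.0
    for u in range(nu):
        for y in range(ny):
            if puy[u][y] > 0.0:  # terms with zero mass contribute nothing
                total += puy[u][y] * math.log2(puy[u][y] / (pu[u] * py[y]))
    return total


# Illustrative binary example (these numbers are assumptions, not from the text).
channel = [[0.9, 0.1], [0.1, 0.9]]       # p(y|x)
perturbation = [[0.8, 0.2], [0.2, 0.8]]  # p(u|x)
for a in (0.1, 0.3, 0.5):
    print(a, mutual_information_uy([1 - a, a], channel, perturbation))
```

Sweeping `a` over a finer grid and taking the largest value gives a crude numerical estimate of the maximization in (36) for a binary input.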



Using (36) in (34), we have

\begin{align}
R &\le \frac{1}{n} + R P_e^{(n)} + C, \notag \\
&\le \epsilon + C, \tag{37}
\end{align}

for any $\epsilon > 0$ and sufficiently large $n$, where (37) follows since $P_e^{(n)} \to 0$ and $1/n \to 0$, so that $1/n + R P_e^{(n)} \le \epsilon$ for sufficiently large $n$. Therefore, (37) implies $R \le C$, which concludes the proof, since it is a sufficient condition for the validity of the statement "for any i.i.d. asymmetric code with $\lambda^{(n)} \to 0$, we necessarily have $R \le C$", per Step 3. ■
Remark 3.5: 

(i) Note that the fact "$W \to U^n \to Y^n \to \hat{W}$ forms a Markov chain" also has an intuitive explanation: From the receiver side, the only channel (a hypothetical channel, which depends on the choice of $p(x)$) between the encoder and the decoder is $p(y^n|u^n)$, since there is a deterministic relation between $W$, $x^n(W)$ and $u^n(W)$, recalling the formation of the decoder's codebook. Therefore, although the transmitter sends $x^n(W)$ using the original communication channel $p(y|x)$, the effective situation from the receiver side is as follows: The transmitter sends $u^n(W)$ using the hypothetical channel $p(y^n|u^n)$, and the channel $p(y^n|u^n)$ is known at the decoder's side (since $p(x)$, $p(y|x)$ and $p(u|x)$ are all known by the decoder); therefore, the resulting problem at hand is analogous (although not equivalent) to the classical point-to-point communication variant mentioned in Remark 3.2 (the difference being the argument over which the maximization is carried out).

(ii) The meaning of the hypothetical channel $p(y^n|u^n)$ mentioned in item (i) of this remark can be explained as follows: Since the received $Y^n$ is due to sending $x^n(W)$ through the channel $p(y|x)$, and the corresponding $u^n(W)$ is produced via perturbing $x^n(W)$, we observe that the dependence of $Y^n$ on $u^n(W)$ is through $x^n(W)$. Therefore, the decoder (in some sense) "derives an estimate of" $x^n(W)$ first (through the usage of $p(x^n|u^n)$, which can be evaluated at the receiver side, since $p(x)$ and $p(u|x)$ are available at the decoder), and then decides on $W$ using $p(y|x)$ and the aforementioned "derived estimate of" $x^n(W)$.

C. Binary Symmetric Case 

In this section, we consider a specific example of the discrete memoryless channel with i.i.d. asymmetric codebooks, which is shown in Figure 1. In particular, we consider the case for which $\mathcal{X} = \mathcal{Y} = \mathcal{U} = \{0, 1\}$, and the communication channel (resp. the perturbation distribution) is the binary symmetric channel (resp. the binary symmetric distribution) with crossover probability $p_1$ (resp. $p_2$). Thus, we have

$$Y = X \oplus Z_1, \tag{38}$$

where $\Pr(Z_1 = 1) = p_1$ and $\Pr(Z_1 = 0) = 1 - p_1$,

$$U = X \oplus Z_2, \tag{39}$$

where $\Pr(Z_2 = 1) = p_2$ and $\Pr(Z_2 = 0) = 1 - p_2$, and where $\oplus$ denotes addition modulo 2.
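The model in (38) and (39) implies $U \oplus Y = Z_1 \oplus Z_2$, so the decoder codeword symbol and the channel output disagree exactly when one of the two noise bits flips. The short Python sketch below enumerates this exactly; the function name `effective_crossover` is a hypothetical helper, and the independence of $Z_1$ and $Z_2$ is the assumption used in Appendix IV.

```python
from itertools import product


def effective_crossover(p1, p2):
    """Exact Pr(U != Y) for Y = X xor Z1, U = X xor Z2, with Z1, Z2 independent."""
    prob = 0.0
    for z1, z2 in product((0, 1), repeat=2):
        weight = (p1 if z1 else 1 - p1) * (p2 if z2 else 1 - p2)
        # U xor Y = Z1 xor Z2, so X cancels and need not be enumerated.
        if z1 ^ z2:
            prob += weight
    return prob


print(effective_crossover(0.1, 0.2))  # equals p1 + p2 - 2*p1*p2 up to rounding
```

The enumerated value coincides with $p_1 + p_2 - 2 p_1 p_2$, the argument of the binary entropy function in (40).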

Theorem 3.4: For the binary symmetric case of the discrete memoryless channel with i.i.d. asymmetric codebooks, the capacity is given by

$$C = 1 - H(p_1 + p_2(1 - 2p_1)), \tag{40}$$

where capacity is achieved if and only if $X$ is a Bernoulli 1/2 random variable, i.e., $\Pr(X = 1) = \Pr(X = 0) = 1/2$.
Proof: First of all, we define the auxiliary random variable $V$ as follows:

$$V = X \oplus Z_1 \oplus Z_2. \tag{41}$$

(Footnote to Remark 3.5(i): Note that this deterministic relation does not contradict the existence of the probabilistic mapping, i.e., the perturbation distribution, in the formation of the decoder's codebook. This deterministic mapping points out the relation between $W$, $x^n(W)$ and $u^n(W)$ after the formation of both $\mathcal{C}_X$ and $\mathcal{C}_U$.)

Next, we provide the following lemma:

Lemma 3.4: $U \to V \to Y$ forms a Markov chain, i.e.,

$$p(u, y \mid v) = p(u \mid v) p(y \mid v). \tag{42}$$

Proof: See Appendix IV. ■

Next, observe that from the definition of the auxiliary random variable $V$ (cf. (41)), we see that $X \to U \to V$ forms a Markov chain in the specified order, i.e.,

$$p(x, v \mid u) = p(x \mid u) p(v \mid u), \tag{43}$$

and $X \to Y \to V$ forms a Markov chain in the specified order, i.e.,

$$p(x, v \mid y) = p(x \mid y) p(v \mid y). \tag{44}$$

Furthermore, combining these with (42) yields the following "circular Markov chain" structure between $X$, $Y$, $V$, $U$, which helps to visualize the situation at hand better (such a construction seems to be new to the best of our knowledge):

Fig. 2. The circular Markov chain structure of the random variables $X$, $Y$, $V$, $U$, defined for the binary symmetric communication channel and the perturbation distribution of Section III-C.

Using the definition of mutual information, we have

\begin{align}
I(V; U) - I(V; U \mid Y) &= H(U) - H(U \mid V) - H(U \mid Y) + H(U \mid V, Y), \notag \\
&= I(U; Y) - I(U; Y \mid V), \notag \\
&= I(U; Y), \tag{45}
\end{align}

where (45) follows since $U \to V \to Y$ forms a Markov chain in the specified order (cf. (42)). Similarly, using the definition of mutual information, we have

\begin{align}
I(V; Y) - I(V; Y \mid U) &= H(V) - H(V \mid Y) - H(V \mid U) + H(V \mid U, Y), \notag \\
&= H(V) + H(V \mid U, Y) - H(p_1) - H(p_2), \tag{46}
\end{align}

where (46) follows using (38), (39) and (41). Noting that

\begin{align}
I(U, Y; V) &= I(V; U) + I(V; Y \mid U), \tag{47} \\
&= I(V; Y) + I(V; U \mid Y), \tag{48}
\end{align}

due to the chain rule for mutual information, we get

$$I(V; U) - I(V; U \mid Y) = I(V; Y) - I(V; Y \mid U). \tag{49}$$

Using (45) and (46) in the left hand side and the right hand side of (49), respectively, we have

$$I(U; Y) = H(V) + H(V \mid U, Y) - H(p_1) - H(p_2). \tag{50}$$

Next, we proceed with evaluating $H(V \mid U, Y)$. Using the chain rule for entropy, we have

\begin{align}
H(X, U, Y, V) &= H(U, Y \mid X, V) + H(X, V), \tag{51} \\
&= H(X, V \mid U, Y) + H(U, Y). \tag{52}
\end{align}

Evaluating the individual terms in (51) and (52), we get

\begin{align}
H(U, Y \mid X, V) &= H(U \mid X, V) + H(Y \mid X, V), \tag{53} \\
H(X, V) &= H(V \mid X) + H(X), \tag{54} \\
H(X, V \mid U, Y) &= H(X \mid U, Y) + H(V \mid U, Y), \tag{55} \\
&= H(X, U, Y) - H(U, Y) + H(V \mid U, Y), \tag{56} \\
&= H(U, Y \mid X) + H(X) - H(U, Y) + H(V \mid U, Y), \tag{57} \\
&= H(U \mid X) + H(Y \mid X) + H(X) - H(U, Y) + H(V \mid U, Y), \tag{58}
\end{align}

where (53) follows using (43), (54) follows from the chain rule for joint entropy, (55) follows using (44), (56) and (57) follow using the chain rule for entropy, and (58) follows using (6).

Now, it is time to sum things up. Using (53) and (54) in (51), using (58) in (52), and equating (51) and (52) yields:

\begin{align}
H(V \mid U, Y) &= H(U \mid X, V) + H(Y \mid X, V) + H(V \mid X) - H(U \mid X) - H(Y \mid X), \notag \\
&= H(U \mid X, V) + H(Y \mid X, V) + H(V \mid X) - H(p_1) - H(p_2), \tag{59}
\end{align}

where (59) follows using (38) and (39).

Now, we evaluate the remaining terms in (59). First, observe that, using the definition of the auxiliary random variable $V$, we have

\begin{align}
p(v = k \mid x = k) &= 1 - p_1 - p_2 + 2 p_1 p_2, \tag{60} \\
p(v = \bar{k} \mid x = k) &= p_1 + p_2 - 2 p_1 p_2, \tag{61}
\end{align}

where $\bar{k} = 1 - k$ for all $k \in \{0, 1\}$. Hence,

$$H(V \mid X) = H(p_1 + p_2 - 2 p_1 p_2). \tag{62}$$
Next, we proceed with evaluating the first term on the right hand side of (59):

\begin{align}
H(U \mid X, V) &= H(X, U, V) - H(X, V), \tag{63} \\
&= H(X, V \mid U) + H(U) - H(X, V), \tag{64} \\
&= H(X \mid U) + H(V \mid U) + H(U) - H(V \mid X) - H(X), \tag{65} \\
&= H(X, U) + H(V \mid U) - H(V \mid X) - H(X), \tag{66} \\
&= H(U \mid X) + H(V \mid U) - H(V \mid X), \tag{67} \\
&= H(p_2) + H(p_1) - H(p_1 + p_2 - 2 p_1 p_2), \tag{68}
\end{align}

where (63) and (64) follow using the chain rule for entropy, (65) follows using (43), (66) and (67) follow using the chain rule for entropy, and (68) follows using (38), (39) and (62). Next, we proceed with evaluating the second term on the right hand side of (59):

\begin{align}
H(Y \mid X, V) &= H(X, Y, V) - H(X, V), \tag{69} \\
&= H(X, V \mid Y) + H(Y) - H(X, V), \tag{70} \\
&= H(X \mid Y) + H(V \mid Y) + H(Y) - H(V \mid X) - H(X), \tag{71} \\
&= H(X, Y) + H(V \mid Y) - H(V \mid X) - H(X), \tag{72} \\
&= H(Y \mid X) + H(V \mid Y) - H(V \mid X), \tag{73} \\
&= H(p_1) + H(p_2) - H(p_1 + p_2 - 2 p_1 p_2), \tag{74}
\end{align}

where (69) and (70) follow using the chain rule for entropy, (71) follows using (44), (72) and (73) follow using the chain rule for entropy, and (74) follows using (39) and (62).
Using (62), (68) and (74) in (59) yields

$$H(V \mid U, Y) = H(p_1) + H(p_2) - H(p_1 + p_2 - 2 p_1 p_2). \tag{75}$$

Next, using (75) in (50) yields:

\begin{align}
I(U; Y) &= H(V) - H(p_1 + p_2 - 2 p_1 p_2), \notag \\
&\le 1 - H(p_1 + p_2(1 - 2 p_1)), \tag{76}
\end{align}

where equality in (76) is achieved if and only if $V$ is Bernoulli 1/2, which is the case if and only if $X$ is Bernoulli 1/2. This concludes the proof of the theorem. ■
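The statement of Theorem 3.4 can be spot-checked numerically. The sketch below, under the assumption that $X$, $Z_1$ and $Z_2$ are mutually independent, computes $I(U;Y)$ directly from the joint distribution of $(U, Y)$ for $\Pr(X = 1) = a$, sweeps $a$ over a grid, and compares the best value with the closed form (40). The helper names `h2`, `capacity` and `info_uy`, and the operating point $p_1 = p_2 = 0.1$, are illustrative choices rather than part of the paper.

```python
import math


def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def capacity(p1, p2):
    """Closed form (40): C = 1 - H(p1 + p2 (1 - 2 p1))."""
    return 1.0 - h2(p1 + p2 * (1 - 2 * p1))


def bsc(p, i, j):
    """Transition probability of a binary symmetric kernel with crossover p."""
    return p if i != j else 1 - p


def info_uy(a, p1, p2):
    """I(U;Y) in bits for Pr(X = 1) = a, from the joint p(u, y)."""
    px = (1 - a, a)
    puy = {(u, y): sum(px[x] * bsc(p2, x, u) * bsc(p1, x, y) for x in (0, 1))
           for u in (0, 1) for y in (0, 1)}
    pu = {u: puy[u, 0] + puy[u, 1] for u in (0, 1)}
    py = {y: puy[0, y] + puy[1, y] for y in (0, 1)}
    return sum(v * math.log2(v / (pu[u] * py[y]))
               for (u, y), v in puy.items() if v > 0)


p1, p2 = 0.1, 0.1
grid = [k / 100 for k in range(1, 100)]
best_a = max(grid, key=lambda a: info_uy(a, p1, p2))
print(best_a, info_uy(best_a, p1, p2), capacity(p1, p2))
```

A grid search is used only for transparency; it does not replace the proof, and a finer grid or other values of $p_1$, $p_2$ can be substituted freely.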

Remark 3.6: Straightforward algebra reveals that the capacity of the binary symmetric case, $C(p_1, p_2) = 1 - H(p_1 + p_2 - 2 p_1 p_2)$, is symmetric around both the line $p_1 = 1/2$ and the line $p_2 = 1/2$, i.e.,

$$C(p_1, p_2) = C(1 - p_1, p_2) = C(p_1, 1 - p_2) = C(1 - p_1, 1 - p_2),$$

which is quite intuitive due to the symmetric structure of the setup. Hence, w.l.o.g., we can assume that $0 \le p_1, p_2 \le 1/2$.
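The symmetry in Remark 3.6 is easy to confirm on a grid. The following Python sketch is a minimal check; `h2`, `capacity` and `max_symmetry_gap` are hypothetical helper names introduced here for illustration.

```python
import math


def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def capacity(p1, p2):
    """C(p1, p2) = 1 - H(p1 + p2 - 2 p1 p2)."""
    return 1.0 - h2(p1 + p2 - 2 * p1 * p2)


def max_symmetry_gap(steps=20):
    """Largest deviation between C(p1, p2) and its three reflections on a grid."""
    gap = 0.0
    for i in range(steps + 1):
        for j in range(steps + 1):
            p1, p2 = i / steps, j / steps
            c = capacity(p1, p2)
            for other in (capacity(1 - p1, p2),
                          capacity(p1, 1 - p2),
                          capacity(1 - p1, 1 - p2)):
                gap = max(gap, abs(c - other))
    return gap


print(max_symmetry_gap())  # expected to be at the level of floating-point rounding
```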
Remark 3.7: Note that for $0 \le p_2, p_1 \le 1/2$, we have

$$p_1 \le p_1 + p_2(1 - 2 p_1) \le (1 - p_1), \tag{77}$$

where the first inequality follows since $1 - 2 p_1 \ge 0$ and the second inequality follows since

\begin{align*}
\left[ p_1 + p_2(1 - 2 p_1) \le (1 - p_1) \right] &\Leftrightarrow \left[ 2 p_1 (1 - p_2) \le (1 - p_2) \right], \\
&\Leftrightarrow \left[ 1/2 \ge p_1 \right].
\end{align*}

Now, since the binary entropy function is monotonically increasing (resp. decreasing) for $0 \le p \le 1/2$ (resp. $1/2 \le p \le 1$) and is symmetric around $p = 1/2$, we have

$$1 - H(p_1 + p_2(1 - 2 p_1)) \le 1 - H(p_1), \tag{78}$$

where the inequality holds with equality if and only if $p_2 = 0$ and/or $p_1 = 1/2$. Recall that the right hand side of (78) is the Shannon capacity for the binary symmetric channel with binary alphabets. Hence, a significant consequence of (78) is that the capacity of the binary symmetric case of the asymmetric codebooks setup (cf. (40)) is strictly less than the "original Shannon type" counterpart (which is a special case of the setup at hand with $p_2 = 0$) for $0 < p_2, p_1 < 1/2$. Note that the case of $p_1 = 1/2$ corresponds to the "information erasure" case for the communication channel; hence it is not possible to transmit any information reliably either in our setup or in the classical Shannon setup. Next, the case of $p_2 = 1/2$ yields the capacity of $1 - H(p_1 + p_2(1 - 2 p_1))|_{p_2 = 1/2} = 1 - H(1/2) = 0$ regardless of the value of $p_1$, which is also obvious, since this case corresponds to the case of "perfectly asymmetric codebooks"; in this case, it is not possible to transmit any information due to the independence of the codewords of $\mathcal{C}_X$ and $\mathcal{C}_U$; as a result, the codewords $\{u^n(w)\}_{w \in \mathcal{W}}$ and the communication channel output $Y^n$ are independent.

Fig. 3. Capacity results of reliable communications with asymmetric codebooks in the binary symmetric case; $p_1$ and $p_2$ denote the crossover probabilities of the communication channel and the codebook perturbation distribution, respectively. The asymmetric codebook capacity, $C(p_1, p_2) = 1 - H(p_1 + p_2 - 2 p_1 p_2)$, is shown as a function of $p_1$ (resp. $p_2$) in panel (a) (resp. panel (c)). The gap between the aforementioned asymmetric codebook capacity and the classical Shannon capacity, $C_{Shannon} = 1 - H(p_1)$, which is given by $H(p_1 + p_2 - 2 p_1 p_2) - H(p_1)$, is shown as a function of $p_1$ (resp. $p_2$) in panel (b) (resp. panel (d)).
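To make Remark 3.7 concrete, the following worked instance evaluates (77) and (78) at the operating point $p_1 = p_2 = 0.1$. These values are chosen here purely for illustration and do not appear in the original text; entropies are in bits and rounded to three decimals.

```latex
\begin{align*}
p_1 + p_2(1 - 2p_1) &= 0.1 + 0.1 \times 0.8 = 0.18, \\
0.1 \le 0.18 &\le 0.9 \quad \text{(cf. (77))}, \\
H(0.1) &\approx 0.469, \qquad H(0.18) \approx 0.680, \\
C(0.1, 0.1) = 1 - H(0.18) &\approx 0.320 \;<\; 1 - H(0.1) \approx 0.531 \quad \text{(cf. (78))}.
\end{align*}
```

At this operating point the codebook asymmetry costs roughly $0.211$ bits per channel use relative to the classical binary symmetric channel.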



Numerical Results: In Fig. 3, we show the numerical capacity results for the binary symmetric case. Specifically, in Fig. 3(a) and Fig. 3(c), we plot $C(p_1, p_2) = 1 - H(p_1 + p_2 - 2 p_1 p_2)$ as functions of the communication channel crossover probability, $p_1$, and the codebook perturbation distribution crossover probability, $p_2$, respectively. From Fig. 3(a), we see that for any given $0 \le p_2 \le 1/2$, $C(p_1, p_2)$ is monotonically decreasing in $p_1$ since the communication channel becomes noisier. Also, note that these two figures show exactly the same behavior due to symmetry: $C(p_1, p_2) = C(p_2, p_1)$, which also implies that the same monotonic behavior of $C(p_1, p_2)$ with respect to $p_2$ holds for any fixed $0 \le p_1 \le 1/2$. On the other hand, another quantity of interest is "the capacity loss" due to the asymmetry between the codebooks, which is given by $C_{Shannon} - C(p_1, p_2) = H(p_1 + p_2 - 2 p_1 p_2) - H(p_1)$, where $C_{Shannon} = C(p_1, p_2 = 0) = 1 - H(p_1)$ is the capacity of the classical binary symmetric channel setup. The capacity loss is depicted as a function of $p_1$ (resp. $p_2$) in Fig. 3(b) (resp. Fig. 3(d)). From Fig. 3(b), we see that for any given $0 \le p_2 \le 1/2$, the capacity loss is monotonically decreasing in $p_1$, and it eventually diminishes to 0 for the case of $p_1 = 1/2$, when no reliable transmission of information is possible in either the asymmetric or the symmetric case. From Fig. 3(d), we see that for any given $0 \le p_1 \le 1/2$, the capacity loss is monotonically increasing in $p_2$, which is also obvious since, as $p_2$ increases, the asymmetry between the codebooks increases, thereby increasing the capacity loss.
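The curves described above can be tabulated with a few lines of Python. The sketch below is only a stand-in for the plotted panels: it fixes an illustrative $p_2 = 0.1$ (an assumption, not a value stated in the text) and lists the capacity and the capacity loss over a grid of $p_1 \in [0, 1/2]$. The names `h2`, `capacity`, `capacity_loss` and `sweep` are hypothetical helpers.

```python
import math


def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def capacity(p1, p2):
    """Asymmetric codebook capacity 1 - H(p1 + p2 - 2 p1 p2)."""
    return 1.0 - h2(p1 + p2 - 2 * p1 * p2)


def capacity_loss(p1, p2):
    """Gap to the classical BSC capacity: H(p1 + p2 - 2 p1 p2) - H(p1)."""
    return h2(p1 + p2 - 2 * p1 * p2) - h2(p1)


def sweep(p2, steps=10):
    """Rows (p1, capacity, loss) for p1 on a uniform grid over [0, 1/2]."""
    rows = []
    for k in range(steps + 1):
        p1 = 0.5 * k / steps
        rows.append((p1, capacity(p1, p2), capacity_loss(p1, p2)))
    return rows


for p1, c, loss in sweep(0.1):
    print(f"p1={p1:.2f}  C={c:.4f}  loss={loss:.4f}")
```

Both printed columns decrease as $p_1$ grows, in line with the behavior attributed to panels (a) and (b).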



In this paper, we introduced a new concept of reliable communications with asymmetric codebooks, which is a generalization of the classical point-to-point communication setup due to Shannon. In particular, we established a channel coding theorem for the special case of a discrete memoryless channel with i.i.d. asymmetric codebooks. We also quantified exact information capacity results for this communication system when the encoder codewords and the decoder codewords are drawn from a binary alphabet, and the communication channel and the perturbation distribution (causing the asymmetry) are analogous to the classical binary symmetric channel. Our setup is inspired by, and serves as the information theoretic basis for, the analysis of robust signal hashing, where the asymmetry is due to the fact that the receiver only has access to hash values of the content owned by the transmitter. We acknowledge that the assumptions of i.i.d. codewords and of a memoryless channel as well as memoryless codebook perturbations are not in general true for robust hashing, and we aim to address those in future research. The proposed work is meant to serve as a first step in the information theoretic treatment of signal hashing.

Appendix I
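Proof of Lemma 3.1

We have

\begin{align}
p(y^n, u^n) &= \sum_{x^n \in \mathcal{X}^n} p(y^n, u^n, x^n), \notag \\
&= \sum_{x^n \in \mathcal{X}^n} p(y^n \mid u^n, x^n) p(u^n \mid x^n) p(x^n), \notag \\
&= \sum_{x^n \in \mathcal{X}^n} p(y^n \mid x^n) p(u^n \mid x^n) p(x^n), \tag{I-1} \\
&= \sum_{x^n \in \mathcal{X}^n} \prod_{i=1}^{n} p(y_i \mid x_i) p(u_i \mid x_i) p(x_i), \tag{I-2} \\
&= \left( \sum_{x_1 \in \mathcal{X}} p(y_1 \mid x_1) p(u_1 \mid x_1) p(x_1) \right) \cdots \left( \sum_{x_n \in \mathcal{X}} p(y_n \mid x_n) p(u_n \mid x_n) p(x_n) \right), \notag \\
&= \prod_{i=1}^{n} \sum_{x_i \in \mathcal{X}} p(y_i \mid x_i) p(u_i \mid x_i) p(x_i), \notag \\
&= \prod_{i=1}^{n} p(y_i, u_i), \tag{I-3}
\end{align}

where (I-1) follows from (6), (I-2) follows using the memoryless property of the communication channel and (2), and (I-3) follows by using the definition $p(y, u) = \sum_{x \in \mathcal{X}} p(y|x) p(u|x) p(x)$. Hence, the sought-after result follows. □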
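The factorization (I-3) can be verified by brute force for a small block length. The following Python sketch sums over all input sequences $x^n$ as in (I-2) and compares the result with the product of single-letter joints. The function names and the example distributions are illustrative assumptions; the sketch presumes an i.i.d. input and memoryless $p(y|x)$ and $p(u|x)$, as in the lemma.

```python
import math
from itertools import product


def joint_single(px, py_x, pu_x, y, u):
    """Single-letter joint p(y, u) = sum_x p(y|x) p(u|x) p(x)."""
    return sum(px[x] * py_x[x][y] * pu_x[x][u] for x in range(len(px)))


def joint_block(px, py_x, pu_x, y_seq, u_seq):
    """p(y^n, u^n) obtained by summing over every input sequence x^n."""
    total = 0.0
    for x_seq in product(range(len(px)), repeat=len(y_seq)):
        term = 1.0
        for x, y, u in zip(x_seq, y_seq, u_seq):
            term *= px[x] * py_x[x][y] * pu_x[x][u]
        total += term
    return total


def max_factorization_error(px, py_x, pu_x, n):
    """Largest |p(y^n, u^n) - prod_i p(y_i, u_i)| over all (y^n, u^n)."""
    ny, nu = len(py_x[0]), len(pu_x[0])
    worst = 0.0
    for y_seq in product(range(ny), repeat=n):
        for u_seq in product(range(nu), repeat=n):
            lhs = joint_block(px, py_x, pu_x, y_seq, u_seq)
            rhs = math.prod(joint_single(px, py_x, pu_x, y, u)
                            for y, u in zip(y_seq, u_seq))
            worst = max(worst, abs(lhs - rhs))
    return worst


# Illustrative, deliberately non-symmetric distributions (assumed values).
px = [0.3, 0.7]
py_x = [[0.9, 0.1], [0.2, 0.8]]
pu_x = [[0.6, 0.4], [0.25, 0.75]]
print(max_factorization_error(px, py_x, pu_x, n=3))
```

The exhaustive sum grows exponentially in $n$, so the check is only practical for very short blocks.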

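Appendix II
Proof of Lemma 3.2

First, note that we have

\begin{align}
\Pr(\mathcal{C}_X) &= \prod_{i=1}^{n} \prod_{w=1}^{2^{nR}} p(x_i(w)), \tag{II-1} \\
\Pr(\mathcal{C}_U \mid \mathcal{C}_X) &= \prod_{i=1}^{n} \prod_{w=1}^{2^{nR}} p(u_i(w) \mid x_i(w)), \tag{II-2}
\end{align}

where (II-1) follows from the definition of the encoder's codebook and (II-2) follows from (1) and (2).
Combining (II-1) and (II-2) yields:

\begin{align}
\Pr(\mathcal{C}_U) &= \sum_{\mathcal{C}_X} \prod_{i=1}^{n} \prod_{w=1}^{2^{nR}} p(x_i(w)) p(u_i(w) \mid x_i(w)), \notag \\
&= \left[ \sum_{x_1(1) \in \mathcal{X}} p(x_1(1)) p(u_1(1) \mid x_1(1)) \right] \cdots \left[ \sum_{x_n(2^{nR}) \in \mathcal{X}} p(x_n(2^{nR})) p(u_n(2^{nR}) \mid x_n(2^{nR})) \right], \notag \\
&= \prod_{w=1}^{2^{nR}} \prod_{i=1}^{n} \sum_{x_i(w) \in \mathcal{X}} p(x_i(w)) p(u_i(w) \mid x_i(w)), \notag \\
&= \prod_{w=1}^{2^{nR}} \prod_{i=1}^{n} p(u_i(w)), \tag{II-3}
\end{align}

where $p(u_i(w)) = \sum_{x_i(w) \in \mathcal{X}} p(x_i(w)) p(u_i(w) \mid x_i(w))$, as given in the statement of the lemma. Hence, (II-3) is the desired result, which concludes the proof. □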

Appendix III
Proof of Lemma 3.3

Proof follows by contradiction: Suppose that there exists a $(2^{nR}, n)$ asymmetric code, say $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)$, with some decoding function $g(\cdot)$ such that $\lambda^{(n)} \to 0$ and $h(\cdot)$ is not one-to-one. Next, we note the following fact: The statement "$h(\cdot)$ is not one-to-one" equivalently means that there exists at least one set of $m \ge 2$ messages, $\mathcal{W}' = \{w_i\}_{i=1}^{m} \subseteq \mathcal{W}$, such that $\forall w_i \in \mathcal{W}'$, we have

$$u^n(w_1) = u^n(w_2) = \ldots = u^n(w_m) = c^n \tag{III-1}$$

for some constant (in $w_i$) length-$n$ vector $c^n \in \mathcal{U}^n$. Here, let $\lambda_i$ and $\lambda^{(n)}$ denote the conditional probability of error and the maximal probability of error corresponding to the code $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)$ with the decoding function $g(\cdot)$, respectively.

Next, consider another asymmetric code, $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)_{MAP}$, which is "derived" from $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)$ in the following way: $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)_{MAP}$ is the same as $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)$, except for the decoding function; the code $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)_{MAP}$ employs the MAP (maximum a posteriori) decoding rule [12] for the given codebooks and the communication channel, and the corresponding decoding function is denoted by $g_{MAP}(\cdot)$ (which is potentially different from the decoding function $g(\cdot)$ employed by $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)$). Let $\lambda_{i,MAP}$ and $\lambda^{(n)}_{MAP}$ denote the conditional probability of error and the maximal probability of error corresponding to the code $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)_{MAP}$ with the decoding function $g_{MAP}(\cdot)$, respectively.

Now, recalling the definition of the MAP decoder [12], we conclude that under the assumption of uniform cost and priors, the MAP decoder minimizes the conditional probability of error given the codebooks and the communication channel. Therefore, we have

$$\forall i \in \mathcal{W}, \quad \lambda_i \ge \lambda_{i,MAP}. \tag{III-2}$$

Hence, (III-2) implies that

$$\lambda^{(n)} = \max_{i \in \mathcal{W}} \lambda_i \ge \lambda^{(n)}_{MAP} = \max_{i \in \mathcal{W}} \lambda_{i,MAP}. \tag{III-3}$$



Also, for all $w_i \in \mathcal{W}'$, and for any channel input $x^n$ and any channel output $y^n$, note that we have

$$p(y^n \mid h(W = w_1) = c^n) = p(y^n \mid h(W = w_2) = c^n) = \ldots = p(y^n \mid h(W = w_m) = c^n), \tag{III-4}$$

by using (III-1). Next, denoting the decoder output by $\hat{W}$, note that the MAP decoding rule is given by

$$\hat{W} = \arg\max_{w \in \mathcal{W}} p(y^n \mid u^n(w)),$$

which is not necessarily unique in general. In particular, if we have

$$p(y^n \mid c^n) = \max_{w \in \mathcal{W}} p(y^n \mid u^n(w)), \tag{III-5}$$

any element $w' \in \mathcal{W}'$ is a maximizer (cf. (III-4)). Let $A(y^n)$ denote some (potentially randomized) MAP decision rule if (III-5) holds. Note that any MAP decision rule should necessarily apply some mapping $A(\cdot)$, of which the range is $\mathcal{W}'$, if (III-5) holds.

(Footnote on the assumption of uniform cost and priors: The assumption of uniform priors follows from our problem definition; the assumption of uniform costs follows from the fact that we are interested in the case which minimizes the probability of error, i.e., the Minimum Probability of Error (MPE) rule, which is equivalent to the MAP rule under the specified assumptions.)

Now, suppose some $w' \in \mathcal{W}'$ has been transmitted and the $(2^{nR}, n)$ asymmetric code, $(\mathcal{X}, \mathcal{C}_X, p(u|x), \mathcal{U}, \mathcal{C}_U)_{MAP}$, with the decoding function $g_{MAP}(\cdot)$ (which incorporates some mapping $A: \mathcal{Y}^n \to \mathcal{W}'$), is applied. Then, for any $w' \in \mathcal{W}'$, we have

\begin{align}
1 - \lambda_{w',MAP} &= \Pr\left[ \hat{W} = w' \mid W = w' \right], \tag{III-6} \\
&\le \Pr\left[ \hat{W} = w' \mid p(y^n \mid c^n) = \max_{w \in \mathcal{W}} p(y^n \mid u^n(w)), \; W = w' \right], \tag{III-7} \\
&= \Pr\left[ \hat{W} = w' \mid p(y^n \mid c^n) = \max_{w \in \mathcal{W}} p(y^n \mid u^n(w)) \right], \tag{III-8} \\
&= \Pr\left[ A(y^n) = w' \right], \tag{III-9}
\end{align}

where (III-6) follows from the definition of $\lambda_{w',MAP}$, (III-7) follows since the event $p(y^n \mid c^n) = \max_{w \in \mathcal{W}} p(y^n \mid u^n(w))$ is a necessary condition for the event $\hat{W} = w'$ for the case of MAP decoding, (III-8) follows since $\max_{w \in \mathcal{W}} p(y^n \mid u^n(w))$ is a sufficient statistic for MAP decoding, and (III-9) follows from the definition of the MAP decoding rule $g_{MAP}(\cdot)$ and the utilized mapping $A(\cdot)$. Hence, using (III-9), for any $w' \in \mathcal{W}'$ we have

$$\lambda_{w',MAP} \ge 1 - \Pr\left[ A(y^n) = w' \right],$$

which, in turn, implies

$$\max_{w' \in \mathcal{W}'} \lambda_{w',MAP} \ge 1 - \min_{w' \in \mathcal{W}'} \Pr\left[ A(y^n) = w' \right]. \tag{III-10}$$



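Next, upon defining $q(w') \triangleq \Pr[A(y^n) = w']$, we note that $\{q(w')\}_{w' \in \mathcal{W}'}$ is a valid p.m.f. over the discrete finite set $\mathcal{W}'$, i.e., $\forall w' \in \mathcal{W}'$, $q(w') \in [0, 1]$ and $\sum_{w' \in \mathcal{W}'} q(w') = 1$. Therefore, we clearly have

$$\max_{q(\cdot)} \min_{w' \in \mathcal{W}'} q(w') = \frac{1}{|\mathcal{W}'|} = \frac{1}{m},$$

which implies

$$\min_{w' \in \mathcal{W}'} \Pr\left[ A(y^n) = w' \right] \le \frac{1}{m}. \tag{III-11}$$

Thus, using (III-11) in (III-10),

$$\max_{w' \in \mathcal{W}'} \lambda_{w',MAP} \ge 1 - \frac{1}{m}. \tag{III-12}$$

Consequently, we have

$$\lambda^{(n)} \ge \lambda^{(n)}_{MAP} \ge \max_{w' \in \mathcal{W}'} \lambda_{w',MAP} \ge 1 - \frac{1}{m} \ge \frac{1}{2} > 0, \tag{III-13}$$

where the first inequality follows from (III-3), the second inequality follows since $\mathcal{W}' \subseteq \mathcal{W}$, the third inequality follows from (III-12), and the fourth inequality follows since $m \ge 2$. Hence, the promised contradiction follows from (III-13). □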
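The counting step behind (III-11) and (III-12) is that no p.m.f. over $m$ messages can give every message more than $1/m$ of the mass, so the least-favored message is decoded correctly with probability at most $1/m$. The Python sketch below illustrates this with randomly drawn tie-breaking distributions; `max_error_lower_bound` and `random_pmf` are hypothetical names, and the random draws are only an illustration, not part of the proof.

```python
import random


def max_error_lower_bound(q):
    """1 - min_w q(w): the right hand side of (III-10) for a tie-break p.m.f. q."""
    return 1.0 - min(q)


def random_pmf(m, rng):
    """Draw an arbitrary p.m.f. over m messages by normalizing random weights."""
    weights = [rng.random() for _ in range(m)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
for m in (2, 3, 5):
    worst_case = min(max_error_lower_bound(random_pmf(m, rng)) for _ in range(1000))
    # The uniform tie-break attains the smallest possible bound, 1 - 1/m.
    print(m, worst_case, 1 - 1 / m)
```

For every draw the bound stays at or above $1 - 1/m$, and hence at or above $1/2$ whenever $m \ge 2$, which is the source of the contradiction in (III-13).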
Appendix IV
Proof of Lemma 3.4

First of all, observe that we have $U = V \oplus Z_1$ and $Y = V \oplus Z_2$ (cf. (38) and (39)). Hence,

\begin{align}
\Pr(U = u \mid V = v) &= \Pr(Z_1 = u \oplus v), \tag{IV-1} \\
\Pr(Y = y \mid V = v) &= \Pr(Z_2 = y \oplus v), \tag{IV-2}
\end{align}

for all $u, v, y \in \{0, 1\}$. Furthermore, we also have

\begin{align}
\Pr(U = u, Y = y \mid V = v) &= \Pr(Z_1 = u \oplus v, Z_2 = y \oplus v), \tag{IV-3} \\
&= \Pr(Z_1 = u \oplus v) \Pr(Z_2 = y \oplus v), \tag{IV-4}
\end{align}

for all $u, v, y \in \{0, 1\}$, where (IV-3) follows using (38) and (39), and (IV-4) follows since $Z_1$ and $Z_2$ are independent. Combining (IV-1), (IV-2) and (IV-4), we conclude that $p(u, y \mid v) = p(u \mid v) p(y \mid v)$, which is the sought-after result. □
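The conditional independence in Lemma 3.4 can be checked by enumerating the eight outcomes of $(X, Z_1, Z_2)$. In this enumeration the joint p.m.f. of $(U, Y, V)$ equals $\Pr(X = u \oplus y \oplus v) \Pr(Z_1 = u \oplus v) \Pr(Z_2 = y \oplus v)$, so the Python sketch below is run at $X \sim$ Bernoulli(1/2), the capacity-achieving input of Theorem 3.4, where the first factor is constant. The helper names, the crossover values, and the mutual independence of $X$, $Z_1$, $Z_2$ are assumptions of this sketch; the parameter `a` is exposed so that other inputs can be probed.

```python
from itertools import product


def joint_uyv(a, p1, p2):
    """Joint p.m.f. of (U, Y, V) for Pr(X=1)=a, Y = X^Z1, U = X^Z2, V = X^Z1^Z2."""
    pmf = {}
    for x, z1, z2 in product((0, 1), repeat=3):
        weight = ((a if x else 1 - a)
                  * (p1 if z1 else 1 - p1)
                  * (p2 if z2 else 1 - p2))
        key = (x ^ z2, x ^ z1, x ^ z1 ^ z2)  # (u, y, v)
        pmf[key] = pmf.get(key, 0.0) + weight
    return pmf


def markov_gap(a, p1, p2):
    """Largest |p(u,y|v) - p(u|v) p(y|v)| over all (u, y, v)."""
    pmf = joint_uyv(a, p1, p2)
    gap = 0.0
    for v in (0, 1):
        pv = sum(w for (_, _, vv), w in pmf.items() if vv == v)
        for u, y in product((0, 1), repeat=2):
            puy_v = pmf.get((u, y, v), 0.0) / pv
            pu_v = sum(pmf.get((u, yy, v), 0.0) for yy in (0, 1)) / pv
            py_v = sum(pmf.get((uu, y, v), 0.0) for uu in (0, 1)) / pv
            gap = max(gap, abs(puy_v - pu_v * py_v))
    return gap


print(markov_gap(0.5, 0.1, 0.2))  # at the level of floating-point rounding
```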